Audiovisual representation of prosody in expressive speech communication
نویسندگان
چکیده
Prosody in a single speaking style – often read speech – has been studied extensively in acoustic speech. During the past few years we have expanded our interest in two directions: 1.) Prosody in expressive speech communication and 2.) Prosody as an audiovisual expression. Understanding the interactions between visual expressions (primarily in the face) and the acoustics of the corresponding speech presents a substantial challenge. Some of the visual articulation is for obvious reasons tightly connected to the acoustics (e.g. lip and jaw movements), but there are other articulatory movements that do not show up on the outside of the face. Furthermore, many facial gestures used for communicative purposes do not affect the acoustics directly, but might nevertheless be connected on a higher communicative level in which the timing of the gestures could play an important role. In this presentation we will give some examples of recent work, primarily at KTH, addressing these questions. We will report on methods for the acquisition and modeling of visual and acoustic data, and some evaluation experiments in which audiovisual prosody is tested. The context of much of our work in this area is to create an animated talking agent capable of displaying realistic communicative behavior and suitable for use in conversational spoken language systems, e.g. a virtual language teacher.
منابع مشابه
Expressive Speech Recognition and Synthesis as Enabling Technologies for Affective Robot-Child Communication
This paper presents our recent and current work on expressive speech synthesis and recognition as enabling technologies for affective robot-child interaction. We show that current expression recognition systems could be used to discriminate between several archetypical emotions, but also that the old adage ”there’s no data like more data” is more than ever valid in this field. A new speech synt...
متن کاملAn Acoustic Study of Emotivity-Prosody Interface in Persian Speech Using the Tilt Model
This paper aims to explore some acoustic properties (i.e. duration and pitch amplitude of speech) associated with three different emotions: anger, sadness and joy against neutrality as a reference point, all being intentionally expressed by six Persian speakers. The primary purpose of this study is to find out if there is any correspondence between the given emotions and prosody patterning in P...
متن کاملProsody and its relationship to language in school-aged children with high-functioning autism.
BACKGROUND Disordered expressive prosody is a widely reported characteristic of individuals with autism. Despite this, it has received little attention in the literature and the few studies that have addressed it have not described its relationship to other aspects of communication. AIMS To determine the nature and relationship of expressive and receptive language, phonology, pragmatics, and ...
متن کاملMeLos: Analysis and Modelling of Speech Prosody and Speaking Style
This thesis addresses the issue of modelling speech prosody for speech synthesis, and presents MeLos: a complete system for the analysis and modelling of speech prosody “the music of speech”. Research into the analysis and modelling of speech prosody has increased dramatically in recent decades, and speech prosody has emerged as a crucial concern for speech synthesis. The issue of speech prosod...
متن کاملModeling of Fundamental Frequency Contour of Thai Expressive Speech using Fujisaki’s Model and Structural Model
Problem statement: In spontaneous speech communication, prosody is an important factor that must be taken into account, since the prosody effects on not only the naturalness but also the intelligibility of speech. Focusing on synthesis of Thai expressive speech, a number of systems has been developed for years. However, the expressive speech with various speaking styles has not been accomplishe...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- Speech Communication
دوره 46 شماره
صفحات -
تاریخ انتشار 2005